A process for combining the results of several classifiers.

A process for successfully combining the results of several classifiers provides a method for
calculating confidences of each classification decision for every classifier involved. Confidences
are then combined according to the Dempster-Shafer Theory of Evidence. Initially, basic probability
assignments for each of the classifiers are calculated and used to calculate confidences for each
classifier. The confidences for all of the classifiers are then combined. The combined confidences
are then used to determine a class for the data input to the classifiers.

Technical Field

The present invention is directed to the field of pattern classification and recognition and, more
particularly, to a mechanism for calculating a confidence of any classification decision.

Background Art 

Pattern recognition problems, such as classification of machine or hand-printed characters, are
currently solved with some accuracy by using traditional classifiers or neural networks of different
types. It is easier in many cases to apply several classifiers to the same recognition task to improve
recognition performance in a combined system, instead of inventing a new architecture or a feature
extractor to achieve the same accuracy. However, it is necessary to assign a measure of evidence to a
classification decision of each classifier in a system, to be able to combine them. Unfortunately,
such assignments demand numerous approximations, especially when the number of classes is large. This
creates computational difficulties, and can decrease the quality of the recognition performance.

It is known in the art to use the Dempster-Shafer Theory of Evidence as a tool for representing and
combining measures of evidence. In the existing art, U.S. Patent No. 5,123,057 uses the Dempster-Shafer
Theory of Evidence to calculate a degree of match between data events portions and model parts.
Evidence is collected and processed after preliminary matching has been performed using the
Dempster-Shafer Theory of Evidence.

Additionally, U.S. Patent No. 5,077,807 to Bokser describes a method for processing input feature
vectors. Again, the '807 patent relates to a preprocessing means for pattern recognition. So although
the prior art addresses the problem of pattern classification and recognition, the existing art does
not address the use of the Dempster-Shafer Theory of Evidence as a postrecognition or postprocessing
tool to combine results of several classifiers.

It is seen then that it would be desirable to have a means for improving the results of classification.

Summary of the Invention

The present invention improves the results of classification by successfully combining the results of
several classifiers. The invention also improves the results of classification by eliminating numerous
approximations which result in a decreased accuracy and quality of the classification and recognition
performance.

Specifically, the present invention uses a distance measure between a classifier output vector and a
mean vector for a subset of training data corresponding to each class. Using these distances for basic
probability assignments in the framework of the Dempster-Shafer Theory of Evidence, evidences for all
classification decisions for each classifier can be calculated and combined.

In accordance with one aspect of the present invention, a method for combining the results of several
classifiers comprises a series of steps. Basic probability assignments are calculated for each of the
classifiers and used to calculate confidences for each classifier. The confidences for all of the
classifiers are then combined. Finally, the combined confidences are used to determine a class for data
input to the classifiers.

Accordingly, it is an object of the present invention to provide a mechanism for calculating a
confidence of any classification decision, which can increase the quality of recognition performance.
Other objects and advantages of the invention will be apparent from the following description, the
accompanying drawings and the appended claims.

Brief Description of the Drawings

Fig. 1 is a block diagram illustrating the combination of several classifiers; and

Fig. 2 is a flowchart illustrating the steps employed to achieve the combination of the several
classifiers, as depicted in Fig. 1.

Detailed Description of the Preferred Embodiments

The present invention relates to a mechanism for combining the results of several classifiers.
Referring to Fig. 1, a block diagram 10 illustrates the combination of several classifiers. In Fig. 1,
three classifiers 12, 14, and 16 are shown for descriptive purposes. However, the concept of the
present invention will be applicable to a system having any number of classifiers, from one to N.

As can be seen in Fig. 1, data from data block 18 is input to each classifier 12, 14, and 16. The
present invention allows for the determination of the class of data from data block 18 being input to
the classifiers 12, 14, and 16, based on the output of the classifiers to a voter block 20.

Referring now to Fig. 2 and continuing with Fig. 1, inside each classifier block 12, 14, and 16, a
basic probability assignment is calculated, as indicated by block 22 of Fig. 2. The basic probability
assignments are used to calculate confidences for each classifier, as indicated at block 24. The
confidences output from each classifier block are combined at the voter block 20, as indicated at
block 26. The voter block 20 uses the outputs of the classifiers as its input, can calculate
confidence, i.e., evidence, for the output of each classifier, and combines these confidences. The
confidence or evidence for each classifier is a measure of the correctness of the answer that the
classifier produces. As indicated at step 28 of Fig. 2, a classification is then made, identifying the
input from the data block 18 for the classifier which is to be recognized, based on the combined
confidences from the voter block 20. The output of the voter block 20, then, is the result of the
classification process of the present invention. The present invention provides improved results by
taking into account the different decisions of different classifiers.

In a preferred embodiment, the basic probability assignments are calculated using a distance measure
in accordance with the Dempster-Shafer Theory of Evidence. The confidences and the combinations are
also calculated according to the Dempster-Shafer Theory of Evidence. Using the distance measures for
basic probability assignments in the framework of the Dempster-Shafer Theory of Evidence, evidences for
all classification decisions for each classifier can be calculated and combined.

In applying the teachings of the present invention, assume $\{x_k\}$ to be a subset of the training
data corresponding to a class k. In addition, assume $E_k^n$ to be a mean vector for a set $f_n(x_k)$
for each classifier $f_n$ and each class k. Then $E_k^n$ is a reference vector for each class k, and

$$d_k^n = \phi(E_k^n, y^n)$$

is a distance measure between $E_k^n$ and the classifier output vector $y^n$. This distance measure
can be used to calculate the basic probability assignments of block 22 in Fig. 2, in accordance with
the Dempster-Shafer Theory of Evidence.
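
The reference vectors described above can be sketched in a few lines. This is a minimal illustration, assuming each classifier's outputs on the training data are available as plain lists of floats with one class label per sample; the function name `reference_vectors` and the sample numbers are hypothetical and do not come from the source.

```python
from collections import defaultdict


def reference_vectors(outputs, labels):
    """Return {class label: mean output vector} for a single classifier.

    outputs[j] is the classifier's output vector for training sample j,
    labels[j] is the class of that sample (hypothetical data layout).
    """
    sums = {}
    counts = defaultdict(int)
    for y, k in zip(outputs, labels):
        if k not in sums:
            sums[k] = [0.0] * len(y)
        # Accumulate the component-wise sum of outputs for class k.
        sums[k] = [s + v for s, v in zip(sums[k], y)]
        counts[k] += 1
    # Divide each sum by the number of samples seen for that class.
    return {k: [s / counts[k] for s in sums[k]] for k in sums}


# Illustrative training outputs for a two-class problem.
outputs = [[0.9, 0.1], [0.7, 0.3], [0.2, 0.8]]
labels = [0, 0, 1]
refs = reference_vectors(outputs, labels)
```

One such dictionary would be built per classifier, since each classifier has its own output space and therefore its own set of mean vectors.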

According to the Dempster-Shafer Theory of Evidence, consider a frame of discernment

$$\Theta = \{\theta_1, \ldots, \theta_K\},$$

where $\theta_k$ is the hypothesis that 'a vector $y^n$ is of the class k'. For any classifier $f_n$
and each class k, a distance measure $d_i^n$ can represent evidence in support of hypothesis
$\theta_k$ if i = k, and in support of $\neg\theta_k$, or against $\theta_k$, if i is not equal to k.

With $\Theta$ as the frame of discernment, $2^\Theta$ denotes the set of all subsets of $\Theta$. A
function m is called a basic probability assignment if

$$m: 2^\Theta \to [0, 1], \qquad m(\emptyset) = 0, \qquad \text{and} \qquad \sum_{A \subseteq \Theta} m(A) = 1,$$

where m(A) represents the exact belief in the hypothesis A. Therefore, if m is a basic probability
assignment, then a function $\mathrm{Bel}: 2^\Theta \to [0, 1]$ satisfying

$$\mathrm{Bel}(B) = \sum_{A \subseteq B} m(A)$$

is called a belief function.
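
As a small illustration of the relationship between a basic probability assignment and its belief function, the sketch below stores m as a dictionary keyed by `frozenset` and sums the masses of all subsets of B. The representation, the function name `belief`, and the numbers are assumptions chosen for the example.

```python
def belief(m, B):
    """Bel(B): total mass of every focal element A that is a subset of B."""
    return sum(mass for A, mass in m.items() if A <= B)


# Hypothetical frame with three hypotheses and an assignment summing to 1.
theta = frozenset({"a", "b", "c"})
m = {
    frozenset({"a"}): 0.5,
    frozenset({"a", "b"}): 0.3,
    theta: 0.2,
}

bel_ab = belief(m, frozenset({"a", "b"}))  # counts {a} and {a, b}
bel_theta = belief(m, theta)               # counts every focal element
```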

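There is a one-to-one correspondence between the belief function and the basic probability assignment.
If $m_1$ and $m_2$ are basic probability assignments on $\Theta$, their combination or orthogonal sum,

$$m = m_1 \oplus m_2,$$

is defined as

$$m(A) = c \sum_{X \cap Y = A} m_1(X)\, m_2(Y),$$

where

$$c^{-1} = 1 - \sum_{X \cap Y = \emptyset} m_1(X)\, m_2(Y),$$

$$m(\emptyset) = 0$$

and

$$A \neq \emptyset.$$

Since there is the one-to-one correspondence between Bel and m, the orthogonal sum of belief functions

$$\mathrm{Bel} = \mathrm{Bel}_1 \oplus \mathrm{Bel}_2$$

is defined in the obvious way.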
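
The orthogonal sum can be sketched as follows, assuming basic probability assignments are stored as dictionaries from `frozenset` focal elements to masses. The function name `dempster_combine`, the error handling for total conflict, and the example masses are illustrative choices rather than anything prescribed by the source.

```python
def dempster_combine(m1, m2):
    """Orthogonal sum of two basic probability assignments."""
    combined = {}
    conflict = 0.0
    for X, a in m1.items():
        for Y, b in m2.items():
            Z = X & Y
            if Z:
                # Mass of the product goes to the intersection of the foci.
                combined[Z] = combined.get(Z, 0.0) + a * b
            else:
                # Products landing on the empty set measure the conflict.
                conflict += a * b
    if conflict >= 1.0:
        raise ValueError("total conflict: the orthogonal sum is undefined")
    # Renormalize so that the remaining masses sum to one.
    return {A: v / (1.0 - conflict) for A, v in combined.items()}


theta = frozenset({"a", "b"})
m1 = {frozenset({"a"}): 0.6, theta: 0.4}
m2 = {frozenset({"b"}): 0.5, theta: 0.5}
m12 = dempster_combine(m1, m2)
```

In this example the two sources disagree, so a mass of 0.3 falls on the empty set and the remaining masses are divided by 0.7.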

Special kinds of belief functions are very good at representing evidence. These functions are called
simple and separable support functions. Bel is a simple support function if there exists an
$F \subseteq \Theta$, called the focus of Bel, such that $\mathrm{Bel}(\Theta) = 1$ and
$\mathrm{Bel}(A) = s$ if both conditions $F \subseteq A$ and $A \neq \Theta$ hold, where s is called
Bel's degree of support. Otherwise, $\mathrm{Bel}(A) = 0$. A separable support function is either a
simple support function or an orthogonal sum of simple support functions. Separable support functions
are very useful when it is desired to combine evidences from several sources. If Bel is a simple
support function with focus $F \neq \Theta$, then $m(F) = s$, $m(\Theta) = 1 - s$, and m is 0
elsewhere.

Let F be a focus for two simple support functions with degrees of support $s_1$ and $s_2$,
respectively. If $\mathrm{Bel} = \mathrm{Bel}_1 \oplus \mathrm{Bel}_2$, then
$m(F) = 1 - (1 - s_1)(1 - s_2)$, $m(\Theta) = (1 - s_1)(1 - s_2)$, and m is zero elsewhere.
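
The closed form for two simple support functions sharing a focus follows from multiplying out the four products of their masses. The short derivation below spells this out; the numerical values in the final comment are illustrative only and are not taken from the source.

```latex
% Both assignments have only F and \Theta as focal elements, and
% F \cap \Theta = F, so no product falls on the empty set and the
% normalizing factor of the orthogonal sum equals 1.
\begin{align*}
m(F)      &= s_1 s_2 + s_1 (1 - s_2) + (1 - s_1) s_2
           = 1 - (1 - s_1)(1 - s_2), \\
m(\Theta) &= (1 - s_1)(1 - s_2).
\end{align*}
% Illustrative numbers: s_1 = 0.6 and s_2 = 0.5 give
% m(F) = 1 - 0.4 \cdot 0.5 = 0.8 and m(\Theta) = 0.2.
```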

Knowing these properties of the simple belief function, then $d_i^n$ can be used as a degree of
support for the Bel with focus $\theta_k$ if i = k. Also, $d_i^n$ are degrees of support for Bel with
focus $\neg\theta_k$ if i is not equal to k. This yields the basic probability assignments

$$m_k(\theta_k) = d_k^n, \qquad m_{\neg k}(\neg\theta_k) = 1 - \prod_{i \neq k} (1 - d_i^n).$$

Combining all of the knowledge about focus, the evidence can be obtained for class k and classifier n
as:

$$e_k(y^n) = (m_k \oplus m_{\neg k})(\theta_k).$$

Expanding on this equation yields:

$$e_k(y^n) = \frac{d_k^n \prod_{i \neq k} (1 - d_i^n)}{1 - d_k^n \left[ 1 - \prod_{i \neq k} (1 - d_i^n) \right]}.$$
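
The per-classifier evidence can be computed directly from the vector of supports. The sketch below assumes the supports $d_i^n$ for one classifier are given as a list of floats in [0, 1]; it combines the support for class k with the pooled support against it, as the orthogonal sum of the two assignments prescribes. The function name `classifier_evidence` and the choice to return 0.0 when the two sources are in total conflict are assumptions made for the example.

```python
def classifier_evidence(d):
    """Return the evidence e_k for every class k of one classifier.

    d[i] is the support d_i for class i, assumed to lie in [0, 1].
    """
    evidences = []
    for k, d_k in enumerate(d):
        # Product over i != k of (1 - d_i): the mass left on the whole
        # frame after pooling all evidence against class k.
        rest = 1.0
        for i, d_i in enumerate(d):
            if i != k:
                rest *= 1.0 - d_i
        # 1 minus the conflicting mass d_k * (1 - rest).
        denom = 1.0 - d_k * (1.0 - rest)
        # Assumption: report zero evidence under total conflict.
        evidences.append(d_k * rest / denom if denom > 0.0 else 0.0)
    return evidences


# Illustrative supports for a three-class problem.
e = classifier_evidence([0.8, 0.1, 0.1])
```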



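Finally, evidences for all classifiers may be combined according to the Dempster-Shafer rule to obtain
a measure of confidence for each class k for the input vector x:

$$e_k(x) = e_k(y^1) \oplus \cdots \oplus e_k(y^N).$$

Since $|\theta_k| = 1$ in this case, then

$$e_k(x) = c \prod_{n=1}^{N} e_k(y^n),$$

where c is a normalizing constant. Now, a class m can be assigned to an input vector x if

$$m = \arg\max_{1 \le k \le K} e_k(x).$$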
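
The final combination and decision step can be sketched as a product of per-classifier evidences followed by an arg-max. This is a minimal illustration, assuming the evidences are given as one list per classifier; normalizing the products so that they sum to one is an assumption about the constant, and the names `combine_evidences` and `decide` are hypothetical.

```python
def combine_evidences(per_classifier):
    """Multiply the evidences of all classifiers class by class.

    per_classifier[n][k] is the evidence of classifier n for class k.
    Returns confidences normalized to sum to one (an assumed convention).
    """
    num_classes = len(per_classifier[0])
    products = [1.0] * num_classes
    for evidences in per_classifier:
        if len(evidences) != num_classes:
            raise ValueError("every classifier must score the same classes")
        products = [p * e for p, e in zip(products, evidences)]
    total = sum(products)
    if total == 0.0:
        raise ValueError("all classes have zero combined evidence")
    return [p / total for p in products]


def decide(confidences):
    """Index of the class with the largest combined confidence."""
    return max(range(len(confidences)), key=confidences.__getitem__)


# Illustrative evidences from three classifiers over three classes.
per_classifier = [
    [0.76, 0.02, 0.02],
    [0.30, 0.50, 0.10],
    [0.60, 0.20, 0.05],
]
confidences = combine_evidences(per_classifier)
winner = decide(confidences)
```

Note that the second classifier alone would pick class 1, while the combined confidences favor class 0 because the other two classifiers support it strongly.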

In accordance with the present invention, there are two almost equivalent best functions for a distance
measure between vectors $E_k^n$ and $y^n$.

One of these distance measures is a cosine between a classifier output vector and a mean vector for a
subset of training data corresponding to each class, or $\cos^2(\alpha_k^n)$, where $\alpha_k^n$ is an
angle between $E_k^n$ and $y^n$, or

$$d_k^n = \frac{(E_k^n \cdot y^n)^2}{\|E_k^n\|^2 \, \|y^n\|^2}.$$
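
The cosine-based measure can be sketched with plain lists. The function name `cosine_squared` is hypothetical, and returning 0.0 for a zero-length vector is an assumption, since the angle is undefined in that case.

```python
def cosine_squared(e, y):
    """Squared cosine of the angle between reference vector e and output y."""
    dot = sum(a * b for a, b in zip(e, y))
    norm_e_sq = sum(a * a for a in e)
    norm_y_sq = sum(b * b for b in y)
    if norm_e_sq == 0.0 or norm_y_sq == 0.0:
        # Assumption: treat an all-zero vector as giving no support.
        return 0.0
    return dot * dot / (norm_e_sq * norm_y_sq)


aligned = cosine_squared([1.0, 0.0], [1.0, 0.0])
orthogonal = cosine_squared([1.0, 0.0], [0.0, 1.0])
diagonal = cosine_squared([1.0, 0.0], [1.0, 1.0])
```

Because the cosine is squared, the value always lies in [0, 1], which is what allows it to serve as a degree of support.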

The other distance measure is a particular function of Euclidean distance between a classifier output
vector and a mean vector for a subset of training data corresponding to each class, or of Euclidean
distance between $E_k^n$ and $y^n$, and:

$$d_k^n = \frac{\left(1 + \|E_k^n - y^n\|^2\right)^{-1}}{\sum_{i} \left(1 + \|E_i^n - y^n\|^2\right)^{-1}}.$$

EP0621556A2
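
The Euclidean-based measure can be sketched in the same style, assuming the reference vectors of one classifier are supplied as a list with one vector per class. The function name `euclidean_support` and the sample vectors are hypothetical.

```python
def euclidean_support(refs, y):
    """Supports d_k from squared Euclidean distances to each reference vector.

    refs[k] is the reference (mean) vector of class k; y is the output vector.
    """
    inverse = [
        1.0 / (1.0 + sum((a - b) ** 2 for a, b in zip(e, y)))
        for e in refs
    ]
    # Normalize over all classes so the supports sum to one.
    total = sum(inverse)
    return [v / total for v in inverse]


# Illustrative reference vectors for two classes and one output vector.
d = euclidean_support([[1.0, 0.0], [0.0, 1.0]], [1.0, 0.0])
```

Here the output coincides with the first reference vector, so the unnormalized terms are 1 and 1/3 and the supports come out as 0.75 and 0.25.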



The present invention provides a simple and effective method for calculation of basic probability
assignments, which permits confidences for individual classification decisions to be obtained. The
process and method of the present invention can be used for the estimation of confidence for several
classifiers and the combination of the results of several classifiers.

Industrial Applicability and Advantages 

The present invention is useful in the field of pattern classification and recognition and has the
advantage of using the results of different classifiers to improve classification and recognition
performance. The present invention has the further advantage of eliminating approximations, which can
be particularly prohibitive in cases where the number of classes is large. This, in turn, increases the
quality of the classification and recognition performance. Simple and effective calculation of basic
probability assignments permits confidences for individual classification decisions to be obtained.
The present invention has the further advantage of allowing for the estimation of confidence for
several classifiers and the combination of the results of several classifiers.

Finally, the present invention has the advantage of being applicable to any traditional statistical
classifiers or neural networks of different architectures and based on different sets of features, as
well as to different applications in which calculations of confidence for each classification decision
are necessary.

Having described the invention in detail and by reference to the preferred embodiment thereof, it will
be apparent that other modifications and variations are possible without departing from the scope of
the invention defined in the appended claims.

The invention is summarized as follows:

1. A method for combining results of several classifiers comprises the steps of:
calculating basic probability assignments for each of the classifiers;
using the basic probability assignments to calculate confidences for each classifier;
combining the confidences for all of the classifiers; and
using the combined confidences to determine a class for data input to the classifiers.

2. A method according to 1 wherein the steps of calculating basic probability assignments and using the
basic probability assignments to calculate confidences for each classifier comprise the step of
applying a suitable theory of evidence.

3. A method according to 2 wherein the suitable theory of evidence comprises the Dempster-Shafer Theory 
of Evidence. 

4. A method according to 1 wherein the step of using the basic probability assignments to calculate
confidences for each classifier comprises the steps of:
using a distance measure between a classifier output vector and a mean vector for a subset of training
data corresponding to each class; and
calculating evidences for all classification decisions for each classifier, using the distances as
basic probability assignments.

5. A method according to 4 wherein the distance measure comprises one of two almost equivalent distance
measures.

6. A method according to 5 wherein the first distance measure comprises a cosine between a classifier
output vector and a mean vector for a subset of training data corresponding to each class.

7. A method according to 5 wherein the second distance measure comprises a function of Euclidean
distance between a classifier output vector and a mean vector for a subset of training data
corresponding to each class.

8. A method according to 1 wherein the step of combining the confidences comprises the step of combining 
the confidences according to the Dempster-Shafer Theory of Evidence. 

9. A system for combining the results of several classifiers comprising:
means for calculating basic probability assignments for each of the classifiers;
means for calculating confidences for each classifier from the basic probability assignments;
means for combining the confidences for all of the classifiers; and
means for determining a class for data input to the classifiers from the combined confidences.

10. A system according to 9 wherein the means for calculating basic probability assignments and
confidences for each classifier comprise a suitable theory of evidence.

11. A system according to 10 wherein the suitable theory of evidence comprises the Dempster-Shafer
Theory of Evidence.

12. A system according to 9 wherein the means for calculating confidences for each classifier
comprises:
a distance measure between a classifier output vector and a mean vector for a subset of training data
corresponding to each class; and
means for calculating evidences for all classification decisions for each classifier, using the
distances as basic probability assignments.

13. A system according to 12 wherein the distance measure comprises one of two almost equivalent
distance measures.

14. A system according to 13 wherein the first distance measure comprises a cosine between a classifier
output vector and a mean vector for a subset of training data corresponding to each class.

15. A system according to 13 wherein the second distance measure comprises a function of Euclidean
distance between a classifier output vector and a mean vector for a subset of training data
corresponding to each class.



Claims 

1. A method for combining results of several classifiers comprises the steps of:
calculating basic probability assignments for each of the classifiers;
using the basic probability assignments to calculate confidences for each classifier;
combining the confidences for all of the classifiers; and
using the combined confidences to determine a class for data input to the classifiers.

2. A method as claimed in claim 1 wherein the steps of calculating basic probability assignments and
using the basic probability assignments to calculate confidences for each classifier comprise the step
of applying a suitable theory of evidence.

3. A method as claimed in claim 1 wherein the step of using the basic probability assignments to
calculate confidences for each classifier comprises the steps of:
using a distance measure between a classifier output vector and a mean vector for a subset of training
data corresponding to each class; and
calculating evidences for all classification decisions for each classifier, using the distances as
basic probability assignments.

4. A method as claimed in claim 3 wherein the distance measure comprises one of two almost equivalent
distance measures.

5. A method as claimed in claim 4 wherein the first distance measure comprises a cosine between a
classifier output vector and a mean vector for a subset of training data corresponding to each class.

6. A system for combining the results of several classifiers comprising:
means for calculating basic probability assignments for each of the classifiers;
means for calculating confidences for each classifier from the basic probability assignments;
means for combining the confidences for all of the classifiers; and
means for determining a class for data input to the classifiers from the combined confidences.

7. A system as claimed in claim 6 wherein the means for calculating basic probability assignments and
confidences for each classifier comprise a suitable theory of evidence.

8. A system as claimed in claim 7 wherein the means for calculating confidences for each classifier
comprises:
a distance measure between a classifier output vector and a mean vector for a subset of training data
corresponding to each class; and
means for calculating evidences for all classification decisions for each classifier, using the
distances as basic probability assignments.

9. A system as claimed in claim 6 wherein the distance measure comprises one of two almost equivalent
distance measures.

10. A system as claimed in claim 9 wherein the second distance measure comprises a function of
Euclidean distance between a classifier output vector and a mean vector for a subset of training data
corresponding to each class.